Abstract. We present a novel method capable of creating optimal eigenspectra
from multicolor redshift surveys for photometric redshift estimation. Our
iterative training algorithm modifies the templates to represent the
photometric measurements better. We present a short description of our
algorithm here. We show that the corrected templates give more precise
photometric redshifts, essentially a "free" feature, since we were not
fitting for the redshifts themselves.



1. Introduction 

In the last couple of years photometric redshift estimation has become a very
powerful tool for obtaining statistical information about our universe, and it
is also frequently used for spectroscopic target selection to measure spectra
of faint galaxies. Different techniques have been developed since Baum ('62),
but all of them may be classified into two groups based on the approach they
use.

The first are the so-called empirical methods (Connolly et al. '95a). Having
a set of objects with spectroscopic and photometric data, one can easily obtain
an analytic, usually polynomial, fitting function for the redshift over the
color "hyperplane". Lower order piecewise fits can also be applied efficiently
(Brunner et al. '99). The function can be evaluated very quickly at any
photometric data point. However, this method has some disadvantages. We need a
very good training set to get the fitting formula, and we still cannot
extrapolate far from the training set. The fitting formula only works for a
specific set of passbands and a limited redshift range. Thus, if we want
photometric redshifts for different catalogs, we need a training set for each
photometric system, so it is very hard to get consistent redshift estimates
across them.
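As an illustration of the empirical approach, the following is a minimal sketch of a quadratic polynomial fit of redshift against colors. It is not the fitting function of the cited papers: the polynomial order, the function names (`design_matrix`, `fit_empirical`, `predict_empirical`) and the synthetic training data are all assumptions made for the example.

```python
import numpy as np

def design_matrix(colors):
    """Quadratic polynomial terms in the colors: 1, c_i, and c_i * c_j (i <= j)."""
    colors = np.atleast_2d(colors)
    n_obj, n_col = colors.shape
    columns = [np.ones(n_obj)]
    columns += [colors[:, i] for i in range(n_col)]
    columns += [colors[:, i] * colors[:, j]
                for i in range(n_col) for j in range(i, n_col)]
    return np.column_stack(columns)

def fit_empirical(colors, z_spec):
    """Least-squares polynomial coefficients from a spectroscopic training set."""
    coeffs, *_ = np.linalg.lstsq(design_matrix(colors), z_spec, rcond=None)
    return coeffs

def predict_empirical(coeffs, colors):
    """Evaluate the fitted polynomial at new photometric data points."""
    return design_matrix(colors) @ coeffs

# Synthetic training set (hypothetical): 200 objects with 3 colors each and a
# redshift that happens to be an exact quadratic function of the colors.
rng = np.random.default_rng(0)
train_colors = rng.uniform(-0.5, 2.0, size=(200, 3))
true_z = (0.3 + 0.4 * train_colors[:, 0]
          + 0.1 * train_colors[:, 1] * train_colors[:, 2])

coeffs = fit_empirical(train_colors, true_z)
z_phot = predict_empirical(coeffs, train_colors)
```

The coefficients are tied to the particular colors used in training, which is why such a fit cannot be carried over to a catalog with a different filter set.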

In the template or SED (spectral energy distribution) fitting methods a set
of restframe spectra is used to work out which spectrum and redshift give the
best representation of the colors of a galaxy. This approach gives us not just
the redshift of the galaxy but also approximate spectral type information.
The technique works very well on separate photometric catalogs with different
filter sets, but the choice of basis spectra is clearly crucial. The currently
used template spectra come from both synthetic (Bruzual & Charlot '93) and
measured (Coleman, Wu & Weedman '80) spectral libraries, but generally they
are applied as they come, in the hope that they are good enough. An advanced
version of the template fitting algorithm has been developed that uses a
"continuum number" of spectra, by means of eigenspectra, instead of a large
number of actual spectra, which is very efficient in terms of CPU time and
memory. Using eigenspectra (Connolly et al. '95b) means we consider all
spectra to be approximately a linear combination of a small number of spectra,
the eigenspectra. It also turns out that there is an optimal subspace
filtering method (see AJC's talk; Budavari et al. '99) which gives more
reliable restframe spectra than direct coefficient fitting.
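The eigenspectrum variant of template fitting can be sketched as a grid search over redshift with a linear least-squares fit of the expansion coefficients at each trial redshift. The sketch below corresponds to the direct coefficient fitting mentioned above, not to the optimal subspace filtering method. The Gaussian filter curves, the two toy "eigenspectra", the response-weighted mean used as synthetic flux, and all function names are assumptions chosen to keep the example self-contained.

```python
import numpy as np

def response_matrix(rest_wave, filters, z):
    """Filter responses sampled on the restframe grid at redshift z.

    Each row is normalized to unit sum, so `row @ spectrum` is a
    response-weighted mean flux (a simplification of real synthetic
    photometry). Assumes every filter overlaps the redshifted grid.
    """
    obs_wave = rest_wave * (1.0 + z)
    rows = []
    for filt_wave, filt_resp in filters:
        resp = np.interp(obs_wave, filt_wave, filt_resp, left=0.0, right=0.0)
        rows.append(resp / resp.sum())
    return np.array(rows)

def fit_template_redshift(flux, sigma, rest_wave, eigenspectra, filters, z_grid):
    """Return (chi2, z, coefficients) of the best fit over the redshift grid."""
    best = (np.inf, None, None)
    for z in z_grid:
        # Simulated flux of every eigenspectrum in every passband.
        model = response_matrix(rest_wave, filters, z) @ eigenspectra.T
        a = model / sigma[:, None]
        b = flux / sigma
        coeffs, *_ = np.linalg.lstsq(a, b, rcond=None)
        chi2 = np.sum((a @ coeffs - b) ** 2)
        if chi2 < best[0]:
            best = (chi2, z, coeffs)
    return best

# Toy setup (all hypothetical): seven Gaussian passbands, two smooth basis spectra.
rest_wave = np.linspace(500.0, 30000.0, 2951)
filters = []
for center in (3000.0, 4500.0, 6000.0, 8000.0, 11000.0, 16000.0, 22000.0):
    width = 0.1 * center
    filt_wave = np.linspace(center - 4 * width, center + 4 * width, 201)
    filters.append((filt_wave, np.exp(-0.5 * ((filt_wave - center) / width) ** 2)))
eigenspectra = np.vstack([
    1.0 / (1.0 + np.exp(-(rest_wave - 4000.0) / 300.0)),  # break-like component
    np.exp(-rest_wave / 8000.0),                          # blue component
])

z_grid = np.linspace(0.0, 2.0, 201)
z_true = z_grid[80]
true_coeffs = np.array([1.0, 0.5])
flux = response_matrix(rest_wave, filters, z_true) @ eigenspectra.T @ true_coeffs
sigma = np.full(len(filters), 0.05)

chi2, z_best, coeffs = fit_template_redshift(
    flux, sigma, rest_wave, eigenspectra, filters, z_grid)
```

Because the model is linear in the coefficients, each trial redshift costs only one small least-squares solve, which is where the savings in CPU time and memory over a large library of discrete templates come from.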

In this paper we describe a new method which has been developed to bring 
together the advantages of both techniques. We show how to train eigenspectra 
in an iterative procedure to represent the photometry better. In the training 
procedure different catalogs and also measured spectra can be incorporated. The 
resulting eigentemplates can be used for redshift estimation on any photometry 
catalog and the redshift prediction becomes more accurate. 

2. Template Reconstruction 

Our goal is to find the underlying basis template spectra that can be used
later for photometric redshift estimation. Since we observe galaxies at
different redshifts, their restframe spectra are sampled at different
wavelengths by the blueshifted filters. If we have one or more deep enough
multicolor redshift surveys, then the rough SEDs defined by the blueshifted
passbands overlap with each other. This means the eigenspectra are
oversampled, and this is what allows us to extract high resolution
eigentemplates from broadband photometry (Csabai et al. '99). Our new approach
differs mainly in the spectrum repair step that we introduce in the next
section.

The iterative training procedure follows a fully statistical approach. Its
robustness is provided by the Karhunen-Loève (KL) expansion (Karhunen '47;
Loève '48). This transformation has been used to derive eigentemplates from
spectra, and this is just what we do in our iterative training. Starting from
a set of eigenspectra, we compute the best fitting type for each galaxy in our
photometric catalog based on its known redshift. This gives us an
approximation to the restframe spectra of the galaxies. Now we can check how
good each spectrum is, at least at those wavelengths where photometric data
are available. We then "smoothly" modify (repair) the spectra to represent the
measured values better. This spectrum correction is the only tricky step, and
it is described later on. With all the modified spectra in hand, we apply the
KL transformation and obtain a new set of eigentemplates. The iteration is
continued until the correlation between simulated and measured flux values is
strong enough.
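The structure of this loop can be sketched as follows. The KL step is written here as a singular value decomposition of the unit-normalized spectra without mean subtraction, which is an assumption about the normalization rather than something stated in the text. The fitting and repair steps are passed in as callables (`fit_types`, `repair`), and in the toy run they are trivial stand-ins, so the example only exercises the KL step and the loop itself.

```python
import numpy as np

def kl_eigenspectra(spectra, n_keep):
    """Leading KL basis vectors of a set of spectra, one spectrum per row.

    Spectra are scaled to unit norm first; the right singular vectors are
    the eigenvectors of the correlation matrix of the normalized spectra.
    """
    spectra = np.asarray(spectra, dtype=float)
    norms = np.linalg.norm(spectra, axis=1, keepdims=True)
    normalized = spectra / norms
    _, _, vt = np.linalg.svd(normalized, full_matrices=False)
    return vt[:n_keep]

def train_eigenspectra(basis, fit_types, repair, n_iter):
    """Iterate: fit types at known redshift, repair spectra, rebuild the basis.

    fit_types(basis) -> array (n_gal, n_wave) of best-fit restframe spectra
    repair(spectra)  -> array (n_gal, n_wave) of corrected spectra
    A fixed iteration count replaces the convergence test for simplicity.
    """
    for _ in range(n_iter):
        approx = fit_types(basis)
        repaired = repair(approx)
        basis = kl_eigenspectra(repaired, len(basis))
    return basis

# Toy run with hypothetical data: 40 spectra that are exact combinations of
# two hidden basis vectors.
rng = np.random.default_rng(1)
hidden_basis = rng.normal(size=(2, 50))
galaxy_spectra = rng.normal(size=(40, 2)) @ hidden_basis

basis = train_eigenspectra(
    rng.normal(size=(2, 50)),             # arbitrary starting basis
    fit_types=lambda b: galaxy_spectra,   # stand-in: ignores the basis
    repair=lambda s: s,                   # stand-in: no correction
    n_iter=3,
)
reconstructed = galaxy_spectra @ basis.T @ basis
```

In a real application `fit_types` would fit the expansion coefficients to the photometry at the spectroscopic redshift, and `repair` would solve the minimization problem described in the next subsection.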
Figure 1. Correlation between spectroscopic and photometric redshifts.
On the right - using 3 CWW eigenspectra, left - using 3 KL eigenspectra
after 30 iterations. See Figure 3 for corresponding eigenspectra.

Repairing Spectra. The way we modify the spectrum of a galaxy is the key
step of our training algorithm. The best fitting linear combination of the
eigenspectra is the current approximation to the galaxy's spectrum, which we
are about to correct according to the photometric values. This is a
multidimensional minimization problem for the spectrum itself, where the
dimension of the problem is determined by the spectral resolution. The cost
function is built from two terms. The first term measures the deviation of the
fluxes simulated from the spectrum from the measured photometric data, and the
second measures the deviation of the spectrum from the template based linear
combination.

$$\chi^2(\mathbf{s}) = \sum_n \frac{1}{\sigma_n^2} \left( f_n - f_n\{\mathbf{s}\} \right)^2 + \sum_k \frac{1}{\varsigma_k^2} \left( s_k - \tilde{s}_k \right)^2 \qquad (1)$$

Here $\mathbf{s}$ is the discrete representation of the spectrum and $s_k$ is
the kth element of $\mathbf{s}$, where k refers to the wavelength;
$\tilde{s}_k$ stands for the template based spectrum and $\varsigma_k$
describes the ability of the spectrum to be changed at a given wavelength.
$f_n$ is the measured flux in the nth passband, and the photometric error in
the nth passband is represented by $\sigma_n$. This minimization problem can
be reduced to a system of linear equations, since $f_n\{\mathbf{s}\}$ is
linear in its variables, $\{s_k\}$, so it is easy to solve. The new KL basis
built from the modified spectra will span a different subspace which
represents the photometry better. The details of this method will be described
elsewhere (Budavari et al. '99).
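Since the simulated fluxes are linear in the spectrum, the two-term cost function has a closed-form minimum. The sketch below writes the simulated fluxes as a matrix product `response @ s` and solves the resulting normal equations. The matrix form, the symbol names, and the toy Gaussian passbands are assumptions made for illustration; the authors' own implementation is described elsewhere.

```python
import numpy as np

def repair_spectrum(template_spec, response, flux, flux_err, flexibility):
    """Minimize
        sum_n ((flux_n - (response @ s)_n) / flux_err_n)**2
      + sum_k ((s_k - template_spec_k) / flexibility_k)**2
    over the spectrum s.

    Setting the gradient to zero gives the linear system
        (R^T W R + D) s = R^T W flux + D template_spec,
    with W = diag(1 / flux_err**2) and D = diag(1 / flexibility**2).
    """
    w_phot = 1.0 / flux_err ** 2
    w_spec = 1.0 / flexibility ** 2
    lhs = response.T @ (w_phot[:, None] * response) + np.diag(w_spec)
    rhs = response.T @ (w_phot * flux) + w_spec * template_spec
    return np.linalg.solve(lhs, rhs)

# Toy example (hypothetical numbers): a flat template spectrum observed
# through three Gaussian passbands whose measured fluxes disagree with it.
wave = np.linspace(3000.0, 9000.0, 121)
centers = np.array([4000.0, 6000.0, 8000.0])
response = np.exp(-0.5 * ((wave[None, :] - centers[:, None]) / 500.0) ** 2)
response /= response.sum(axis=1, keepdims=True)   # unit-sum rows

template_spec = np.ones_like(wave)
flux = np.array([1.10, 1.00, 0.85])
flux_err = np.full(3, 0.02)
flexibility = np.full(wave.size, 0.1)

repaired = repair_spectrum(template_spec, response, flux, flux_err, flexibility)

chi2_before = np.sum(((flux - response @ template_spec) / flux_err) ** 2)
chi2_after = np.sum(((flux - response @ repaired) / flux_err) ** 2)
```

In this formulation the correction is a combination of the passband response curves scaled by the squared flexibility, so it is smooth wherever those are smooth. A small flexibility pins the spectrum to the template, and a large one lets the photometry dominate.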

3. Application 

Our training algorithm has been applied to the HDF/NICMOS catalog (Williams
et al. '96; Dickinson et al. '99), which has photometric data of unique quality
and a reasonable number of objects with spectroscopic redshifts (Cohen '98).
Figure 2. Redshift limited scatter plot showing a close-up of Figure 1.



There are photometric data available in seven passbands: 4 HDF filters (F300W,
F450W, F606W, F814W), 2 NICMOS filters (F110W, F160W) and a ground-based K'
band. Our initial eigentemplates were derived from the Coleman, Wu & Weedman
(CWW) spectra, which usually provide better redshift estimates than others
(Fernandez-Soto et al. '99; Hogg et al. '98).

After a few iterations the redshift scatter becomes tighter while the
eigentemplates change only slightly. The comparison between the original and
the resulting eigenspectra after 30 iterations (KL-30) can be seen in Figure 3.
This tiny difference between them was enough to reduce the overall redshift
scatter by about a factor of two, as can be seen in Figures 1 and 2. The
corresponding statistical errors were calculated for two redshift ranges: the
overall sample and a redshift limited sample. Beyond the standard root mean
square error (Δ_rms), the relative deviation of (1 + z) was also computed
(Δ_rel) for comparison with estimates found by other authors. Table 1 contains
the calculated error estimates for both the CWW and KL-30 cases.



Table 1. Comparison of the errors in the fits based on the Coleman,
Wu & Weedman and the KL (after 30 iterations) eigenspectra

Error    Range     CWW      KL-30
Δ_rms    z < 6     0.23     0.12
Δ_rel    z < 6     0.079    0.042
Δ_rms    z < .8    0.087    0.063
Δ_rel    z < .8    0.050    0.035
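For readers who want to compute comparable statistics on their own catalogs, here is a minimal sketch of the two error measures. The rms error is standard. For the relative error the text only says "relative deviation of (1 + z)", so taking it as the rms of the redshift difference divided by (1 + z_spec) is an assumption. The function name, the cut on spectroscopic redshift, and the sample values are also hypothetical.

```python
import numpy as np

def redshift_errors(z_spec, z_phot, z_max=None):
    """Return (rms error, relative error) of photometric redshifts.

    The relative error is taken here as the rms of (z_phot - z_spec) / (1 + z_spec);
    that definition is an assumption, not taken from the paper.
    If z_max is given, only objects with z_spec < z_max are used.
    """
    z_spec = np.asarray(z_spec, dtype=float)
    z_phot = np.asarray(z_phot, dtype=float)
    if z_max is not None:
        keep = z_spec < z_max
        z_spec, z_phot = z_spec[keep], z_phot[keep]
    dz = z_phot - z_spec
    d_rms = np.sqrt(np.mean(dz ** 2))
    d_rel = np.sqrt(np.mean((dz / (1.0 + z_spec)) ** 2))
    return d_rms, d_rel

# Hypothetical values, not data from the paper.
z_spec = [0.0, 1.0, 3.0]
z_phot = [0.1, 1.2, 2.6]
rms_all, rel_all = redshift_errors(z_spec, z_phot)
rms_low, rel_low = redshift_errors(z_spec, z_phot, z_max=2.0)
```

Dividing by (1 + z) keeps a few high-redshift objects with large absolute errors from dominating the statistic, which makes results from samples with different redshift ranges easier to compare.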



In order to test the robustness of the method we also started the iteration
from arbitrary constant functions. After just a couple of iterations the
4000 Å break was visible, and other important features soon emerged from
scratch.
Figure 3. These figures show the initial eigenspectra derived from
the Coleman, Wu & Weedman spectra (dotted line) and the corrected
KL spectra after 30 iterations (solid line). The horizontal dashed line
signals the zero level. See Figures 1 and 2 for related redshift scatter
plots.

The limitation of our current application is the relatively small number of
galaxies with accurate multicolor photometry and redshifts in the Hubble Deep
Field. Consequently the third eigenspectrum appears to be overfit (we run out
of degrees of freedom). As our technique is constructed to be applied to any
set of multicolor observations, we can incorporate many multicolor datasets
from different sources (i.e., we do not need to train the relation for a given
set of filters or magnitude limits). With the new generation of multicolor
photometric and spectroscopic surveys nearing completion, we therefore expect
the accuracy of the derived spectral energy distributions to improve
dramatically in the near future.

4. Conclusions 

We have shown that the current templates used in SED fitting photometric
redshift estimation can be modified to give better correlation between the
actual photometry and the template based simulated fluxes. We presented a
robust method that creates better eigenspectra in an iterative way and
converges very quickly. The training procedure is very generic: different
catalogs can be incorporated at the same time, even with different filter
sets. It is also easy to extend the method to involve measured spectra if
needed.

We showed that the new modified eigentemplates give us a better basis for
photometric redshift estimation, even though the training procedure does not
fit for the redshift. In fact, the photometric redshift estimation is only
performed for monitoring purposes.